Combined Optimization and Reinforcement Learning for Manipulation Skills
Authors
Abstract
This work addresses the problem of how a robot can improve a manipulation skill in a sample-efficient and secure manner. As an alternative to the standard reinforcement learning formulation where all objectives are defined in a single reward function, we propose a generalized formulation that consists of three components: 1) A known analytic control cost function; 2) A black-box return function; and 3) A black-box binary success constraint. While the overall policy optimization problem is high-dimensional, in typical robot manipulation problems we can assume that the black-box return and constraint only depend on a lower-dimensional projection of the solution. With our formulation we can exploit this structure for a sample-efficient learning framework that iteratively improves the policy with respect to the objective functions under the success constraint. We employ efficient 2nd-order optimization methods to optimize the high-dimensional policy w.r.t. the analytic cost function while keeping the lower-dimensional projection fixed. This is alternated with safe Bayesian optimization over the lower-dimensional projection to address the black-box return and success constraint. During both improvement steps the success constraint is used to keep the optimization in a secure region and to clearly distinguish between motions that lead to success or failure. The learning algorithm is evaluated on a simulated benchmark problem and a door opening task with a PR2.
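The alternating scheme described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the quadratic analytic cost, the black-box return and success functions, the fixed linear projection, and the crude local search standing in for safe Bayesian optimization are toy stand-ins, not the paper's actual models.

```python
import numpy as np

D, d = 20, 2                                   # high-dim policy, low-dim projection
P = np.zeros((d, D))
P[0, 0] = P[1, 1] = 1.0                        # fixed linear projection (assumed)

def analytic_cost(theta):
    """Known, differentiable control cost (toy quadratic)."""
    return 0.5 * theta @ theta

def blackbox_return(z):
    """Black-box return, observed only on the low-dim projection (toy)."""
    return -np.sum((z - 0.3) ** 2)

def success(z):
    """Black-box binary success constraint (toy)."""
    return np.linalg.norm(z) < 1.0

def inner_second_order(z):
    """High-dim step: minimize the analytic cost with P @ theta pinned to z.
    For this quadratic cost the constrained minimizer is closed-form:
    free coordinates go to 0, projected coordinates match z."""
    theta = np.zeros(D)
    theta[:d] = z
    return theta

def outer_safe_search(z, step=0.1):
    """Crude stand-in for safe Bayesian optimization: evaluate nearby
    candidates, keep only the successful ones, pick the best return."""
    directions = np.vstack([np.eye(d), -np.eye(d)])
    candidates = [z] + [z + step * e for e in directions]
    safe = [c for c in candidates if success(c)]
    return max(safe, key=blackbox_return)

z = np.array([0.0, 0.0])                       # start from a known-successful projection
theta = np.zeros(D)
for _ in range(20):
    theta = inner_second_order(z)              # high-dim, analytic objective
    z = outer_safe_search(z)                   # low-dim, black-box objective
```

The loop never evaluates a candidate projection without first checking the success constraint, which mirrors how the paper keeps the optimization inside a secure region during both improvement steps.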
Similar resources
Learning Dynamic Manipulation Skills under Unknown Dynamics with Guided Policy Search
Planning and trajectory optimization can readily be used for kinematic control of robotic manipulation. However, planning dynamic motor skills requires a detailed physical simulation, and some aspects of the task, such as contacts, are very difficult to simulate with enough accuracy for dynamic manipulation. Alternatively, manipulation skills can be learned from experience, allowing them to def...
Generalized Reinforcement Learning for Manipulation Skills – Combining Low-dimensional Bayesian Optimization with High-dimensional Motion Optimization
This paper addresses the problem of how a robot can autonomously improve a manipulation skill in an efficient and secure manner. Instead of using the standard reinforcement learning formulation where all objectives are defined in a single reward function, we propose a generalized formulation that consists of three components: 1) A known analytic cost function; 2) A black-box reward function; 3)...
Deep Reinforcement Learning for Robotic Manipulation
Reinforcement learning holds the promise of enabling autonomous robots to learn large repertoires of behavioral skills with minimal human intervention. However, robotic applications of reinforcement learning often compromise the autonomy of the learning process in favor of achieving training times that are practical for real physical systems. This typically involves introducing hand-engineered ...
Low-Area/Low-Power CMOS Op-Amps Design Based on Total Optimality Index Using Reinforcement Learning Approach
This paper presents the application of reinforcement learning in automatic analog IC design. In this work, the Multi-Objective approach by Learning Automata is evaluated for accommodating required functionalities and performance specifications considering optimal minimizing of MOSFETs area and power consumption for two famous CMOS op-amps. The results show the ability of the proposed method to ...
Composable Deep Reinforcement Learning for Robotic Manipulation
Model-free deep reinforcement learning has been shown to exhibit good performance in domains ranging from video games to simulated robotic manipulation and locomotion. However, model-free methods are known to perform poorly when the interaction time with the environment is limited, as is the case for most real-world robotic tasks. In this paper, we study how maximum entropy policies trained usi...
Publication date: 2016